142 research outputs found

    Les technologies de la parole et du TALN pour l'assistance à domicile des personnes âgées : un rapide tour d'horizon (Quick tour of NLP and speech technologies for ambient assisted living) [in French]

    National audience. To meet the challenge of enabling an ageing population to remain at home, one of the solutions adopted by industrialized countries is the large-scale development of Information and Communication Technologies (ICT). ICT offers a major opportunity to improve the daily lives of elderly people, so that they remain in control of their choices and use technology to keep living independently, learning, and taking part in social life. Natural language and speech processing technologies, which lie at the heart of human communication, therefore have a significant role to play. In this article we survey the NLP and speech processing technologies currently being developed in this context, along with the technical or ethical obstacles and pitfalls that may limit their impact.

    Gender Representation in Open Source Speech Resources

    With the rise of artificial intelligence (AI) and the growing use of deep-learning architectures, the ethics, transparency and fairness of AI systems have become a central concern within the research community. We address transparency and fairness in spoken language systems through a study of gender representation in speech resources available on the Open Speech and Language Resource platform. We show that finding gender information in open source corpora is not straightforward and that gender balance depends on other corpus characteristics (elicited/non-elicited speech, low/high resource language, speech task targeted). The paper ends with recommendations to researchers about metadata and gender information, in order to ensure better transparency of the speech systems built using such corpora. Comment: accepted to LREC202

    Gender Representation in French Broadcast Corpora and Its Impact on ASR Performance

    This paper analyzes gender representation in four major corpora of French broadcast speech. Because these corpora are widely used within the speech processing community, they are a primary source of training material for automatic speech recognition (ASR) systems. As gender bias has been highlighted in numerous natural language processing (NLP) applications, we study the impact of gender imbalance in TV and radio broadcast on the performance of an ASR system. The analysis shows that women are under-represented in our data in terms of both speakers and speech turns. We introduce the notion of speaker role to refine the analysis and find that women are even fewer within the Anchor category, which corresponds to prominent speakers. The disparity in available data between the two genders causes performance to decrease for women. However, this global trend can be counterbalanced for speakers who are used to speaking in the media when a sufficient amount of data is available. Comment: Accepted to ACM Workshop AI4T

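    Construction faiblement supervisée d'un phonétiseur pour la langue Iban à partir de ressources en Malais (Weakly supervised construction of a grapheme-to-phoneme converter for the Iban language from Malay resources) [in French]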

    International audience. This paper describes our experiments and results on using a locally dominant language in Malaysia (Malay) to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban (also spoken in Malaysia, on the island of Borneo). Resources for building an Iban speech recognition system were nonexistent. We therefore tried to take advantage of a language from the same family that shares several similarities. First, to deal with the pronunciation dictionary, we proposed a bootstrapping strategy to develop an Iban pronunciation lexicon from a Malay one. A hybrid version, mixing Malay and Iban pronunciations, was also built and evaluated. Following this, we experimented with three Iban ASR systems, each relying on one of the three pronunciation dictionaries: Malay, Iban or hybrid. This article describes our collection of resources for the Iban language (spoken notably on the island of Borneo), with the aim of building an automatic speech recognition system for this language. We focused in particular on a methodology for bootstrapping the phonetized lexicon from a closely related language (Malay). The performance of the first ASR systems built for Iban (< 20% WER) shows that using a grapheme-to-phoneme converter already available for a closely related language (Malay) is an entirely viable option for bootstrapping the development of an ASR system in a new, very under-resourced language. A first error analysis highlights problems that are well known for under-resourced languages: spelling normalization problems, and errors related to morphology (whether or not affixes are separated from the root).

    USING MALAY RESOURCES TO BOOTSTRAP ASR FOR A VERY UNDER-RESOURCED LANGUAGE: IBAN

    International audience. This paper describes our experiments and results on using a locally dominant language in Malaysia (Malay) to bootstrap automatic speech recognition (ASR) for a very under-resourced language: Iban (also spoken in Malaysia, on the island of Borneo). Resources for building an Iban speech recognition system were nonexistent. We therefore tried to take advantage of a language from the same family that shares several similarities. First, to deal with the pronunciation dictionary, we proposed a bootstrapping strategy to develop an Iban pronunciation lexicon from a Malay one. A hybrid version, mixing Malay and Iban pronunciations, was also built and evaluated. Following this, we experimented with three Iban ASR systems, each relying on one of the three pronunciation dictionaries: Malay, Iban or hybrid.

    Speech Recognition of Aged Voices in the AAL Context: Detection of Distress Sentences

    International audience. By 2050, about a third of the French population will be over 65. In the context of technology development aimed at helping aged people live independently at home, the CIRDO project aims to integrate an ASR system into a social inclusion product designed for elderly people, in order to detect distress situations. Speech recognition systems show a higher word error rate (WER) on speech uttered by elderly speakers than on non-aged voices. Two specialized French corpora, AD80 and ERES38, were recorded by aged people in this framework; they were first used to study whether a standard ASR system can be adapted to aged voices. We then examined whether the variability of the WER across speakers could be correlated with the level of dependence. Finally, we assessed the performance of distress sentence detection by a filter and demonstrated a significant drop in performance for those with the lowest degree of autonomy.

    Analysing the Performance of Automatic Speech Recognition for Ageing Voice: Does it Correlate with Dependency Level?

    International audience. Ambient Assisted Living (AAL) aims to provide assistance by allowing people with special needs to perform tasks they find increasingly difficult, and to provide reassurance through surveillance in order to detect distress and accidental falls. Aged people are among those who might benefit from advances in ICT to live as long as possible in their own homes. The voice-based smart home is a promising way to provide AAL, but even mature technologies must be evaluated from the perspective of their potential beneficiaries. In this paper, we investigate which characteristics of the ageing voice challenge a state-of-the-art ASR system. Although chronological age is retained in the literature as the sole factor predicting a decrease in performance, we show that the degree of loss of autonomy is even more strongly correlated with ASR performance.

    Development of Automatic Speech Recognition Techniques for Elderly Home Support: Applications and Challenges

    International audience. Voice command may have considerable usability advantages in the AAL domain. However, efficient audio analysis in a smart home environment is a challenging task, in large part because of poor speech recognition results for elderly people. Dedicated speech corpora were recorded and used to adapt generic speech recognizers to this population. The results of a first experiment allowed conclusions to be drawn about distress call detection. A second experiment involved participants who played out fall scenarios in a realistic smart home; 67% of the distress calls were detected online. These results show the difficulty of the task and serve as a basis for discussing the stakes and challenges of this promising technology for AAL.

    In-home detection of distress calls: the case of aged users

    International audience. In the context of technology development aimed at helping aged people live independently at home, the CIRDO project aims to integrate an ASR system into a social inclusion product designed for elderly people, in order to detect distress situations and provide the capability to call for help. In this context, we present a system able to detect distress and call-for-help sentences online.

    Contribution à l'étude de la variabilité de la voix des personnes âgées en reconnaissance automatique de la parole (Contribution to the study of elderly people's voice variability in automatic speech recognition) [in French]

    National audience. The use of speech recognition for assisted living is hampered by the difficulty of using ASR systems that were not originally designed for aged voices. To characterize how a recognition system behaves differently for elderly and non-elderly speakers, we studied which phonemes are least well recognized, based on the AD80 corpus that we recorded. The results show that some phonemes, such as plosives, are more specifically affected by age. In addition, we collected the dedicated ERES38 corpus in order to adapt the acoustic models, which resulted in a 15% reduction in word error rate. Despite the high variability in performance, we characterized how the drop in performance of the automatic speech recognition system can be correlated with the loss of autonomy of elderly people.